Constant-Time Markov Tracking for Sequential POMDPs
Abstract
Introduction

Stochastic representations of a problem domain can capture aspects that are otherwise difficult to model, such as errors, alternative outcomes of actions, and uncertainty about the world. Markov models (Bellman 1957) are a popular choice for constructing stochastic representations. The classic MDP, however, does not account for uncertainty in the process state; often this state is known only indirectly through sensors or tests. To account for state uncertainty, MDPs have been extended to partially observable MDPs (POMDPs) (Lovejoy 1991; Monahan 1982; Washington 1996). In this model, the underlying process is an MDP, but rather than supplying exact state information, the process produces one of a set of observations.

The major drawback of POMDPs is that finding an optimal plan is computationally intractable for realistic problems. We are interested in what restrictions may be imposed that allow on-line computation. We have developed the approach of Markov Tracking (Washington 1997), which chooses actions that coordinate with a process rather than influence its behavior. It uses the POMDP model to follow the agent's state and reacts optimally to it. In this paper we discuss the application of Markov Tracking to a subclass of problems called sequential POMDPs. Within this class the optimal action is computed not only locally but in constant time, allowing true on-line performance on large-scale problems.
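To make the tracking idea concrete, here is a minimal Python sketch of one step of belief tracking followed by a reactive action choice, assuming a discrete POMDP with per-action transition matrices T[a] and observation matrices O[a]. All names are illustrative, and the action-selection rule shown (react to the most likely state) is only one simple instantiation of "reacting optimally", not necessarily the paper's exact rule; the paper's constant-time result further relies on the sequential structure keeping the belief support small, which this generic sketch does not exploit.

```python
import numpy as np

def track_and_react(b, a_prev, o, T, O, best_action):
    """One step of belief tracking plus a reactive action choice (sketch).

    b           -- current belief over states, shape (|S|,)
    a_prev      -- action just executed
    o           -- observation just received
    T[a]        -- transition matrix, T[a][s, s'] = P(s' | s, a)
    O[a]        -- observation matrix, O[a][s', o] = P(o | s', a)
    best_action -- hypothetical precomputed map: state -> locally optimal action
    """
    # Bayes filter: fold in the transition model, then condition on o.
    b = T[a_prev].T @ b          # predict: P(s' | history, a_prev)
    b = b * O[a_prev][:, o]      # correct: weight by observation likelihood
    b = b / b.sum()              # renormalize

    # React to the tracked state rather than plan ahead: one simple choice
    # is the action that is optimal for the most likely state.
    return b, best_action[int(np.argmax(b))]
```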
Similar papers
A Particle Filtering Algorithm for Interactive POMDPs
Interactive POMDP (I-POMDP) is a stochastic optimization framework for sequential planning in multiagent settings. It represents a direct generalization of POMDPs to multiagent cases. Expectedly, I-POMDPs also suffer from a high computational complexity, thereby motivating approximation schemes. In this paper, we propose using a particle filtering algorithm for approximating the I-POMDP belief ...
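To illustrate the technique this entry describes, below is a minimal sketch of one bootstrap particle-filter update of a sampled belief. It shows only the basic POMDP case (the I-POMDP version additionally samples models of the other agents); sample_next and obs_prob are assumed helper functions standing in for the transition and observation models.

```python
import random

def particle_filter_step(particles, a, o, sample_next, obs_prob, n=None):
    """One bootstrap-filter update of a sampled belief (illustrative sketch).

    particles   -- list of sampled states approximating the current belief
    a, o        -- action taken and observation received
    sample_next -- function (s, a) -> s' drawn from the transition model
    obs_prob    -- function (s', a, o) -> P(o | s', a)
    """
    n = n or len(particles)
    # Propagate each particle through the transition model.
    proposed = [sample_next(s, a) for s in particles]
    # Weight each particle by how well it explains the observation.
    weights = [obs_prob(s, a, o) for s in proposed]
    if sum(weights) == 0:
        # Degenerate case: no particle explains o; resample uniformly.
        return random.choices(proposed, k=n)
    # Importance resampling proportional to the weights.
    return random.choices(proposed, weights=weights, k=n)
```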
Stochastic Local Search for POMDP Controllers
The search for finite-state controllers for partially observable Markov decision processes (POMDPs) is often based on approaches like gradient ascent, attractive because of their relatively low computational cost. In this paper, we illustrate a basic problem with gradient-based methods applied to POMDPs, where the sequential nature of the decision problem is at issue, and propose a new stochast...
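The abstract is truncated before it states the proposed algorithm, so the following is only a generic stochastic local-search skeleton over controller parameters, not the paper's method; evaluate and perturb are assumed placeholders (e.g., value estimation by simulation and a random local move in parameter space).

```python
import random

def local_search_controller(init_params, evaluate, perturb, iters=1000):
    """Stochastic local search over finite-state-controller parameters (sketch).

    init_params -- initial controller parameters
    evaluate    -- params -> estimated policy value (e.g., by simulation)
    perturb     -- params -> randomly modified copy of params
    """
    best, best_val = init_params, evaluate(init_params)
    for _ in range(iters):
        cand = perturb(best)     # random local move, unlike a gradient step
        val = evaluate(cand)
        if val > best_val:       # hill-climb: keep strict improvements
            best, best_val = cand, val
    return best
```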
Hilbert Space Embeddings of PSRs
Many problems in machine learning and artificial intelligence involve discrete-time partially observable nonlinear dynamical systems. If the observations are discrete, then Hidden Markov Models (HMMs) (Rabiner, 1989) or, in the control setting, Partially Observable Markov Decision Processes (POMDPs) (Sondik, 1971) can be used to represent belief as a discrete distribution over latent states. Pr...
A Framework for Optimal Sequential Planning in Multiagent Settings
Introduction Research in autonomous agent planning is gradually moving from single-agent environments to those populated by multiple agents. In single-agent sequential environments, partially observable Markov decision processes (POMDPs) provide a principled approach for planning under uncertainty. They improve on classical planning by not only modeling the inherent non-determinism of the probl...
Believing in POMDPs
Partially observable Markov decision processes (POMDP) are well-suited for realizing sequential decision making capabilities that respect uncertainty in Companion systems that are to naturally interact with and assist human users. Unfortunately, their complexity prohibits modeling the entire Companion system as a POMDP. We therefore propose an approach that makes use of abstraction to enable em...